Abstract

The largest eigenvalue of the adjacency matrix of a network 
(referred to as the spectral radius) is an important metric in 
its own right. Further, for several models of epidemic spread 
on networks (e.g., the ‘flu-like’ SIS model), it has been shown 
that an epidemic dies out quickly if the spectral radius of the 
graph is below a certain threshold that depends on the model 
parameters. This motivates a strategy to control epidemic 
spread by reducing the spectral radius of the underlying 
network. 

In this paper, we develop a suite of provable approximation
algorithms for reducing the spectral radius by removing
the minimum cost set of edges (modeling quarantining) or
nodes (modeling vaccinations), with different time and quality
tradeoffs. Our main algorithm, GreedyWalk, is based
on the idea of hitting closed walks of a given length, and gives
an $O(\log^2 n)$-approximation, where n denotes the number of
nodes; it also performs much better in practice compared to
all prior heuristics proposed for this problem. We further
present a novel sparsification method to improve its running
time.

In addition, we give a new primal-dual based algorithm
with an even better approximation guarantee ($O(\log n)$),
albeit with slower running time. We also give lower bounds
on the worst-case performance of some of the popular
heuristics. Finally, we demonstrate the applicability of our
algorithms and the properties of our solutions via extensive
experiments on multiple synthetic and real networks.

1 Introduction 

Given a contact network, which contacts should we 
remove to contain the spread of a virus? Equivalently, 
in a computer network, which connections should we cut 
to prevent the spread of malware? Designing effective
and low-cost interventions is a fundamental challenge
in public health and network security. Epidemics are
commonly modeled by stochastic diffusion processes,
such as the so-called ‘SIS’ (flu-like) and ‘SIR’ (mumps-like)
models on networks (more in Section 2). An
important result that highlights the impact of the 
network structure on the dynamics is that epidemics 
die out “quickly” if $\rho(G) < T$, where $\rho(G)$ is the
spectral radius (or the largest eigenvalue) of graph G,
and T is a threshold that depends on the disease model 
[HIMIISI]- This motivates the following strategy for 
controlling an epidemic: remove edges (quarantining) or 
nodes (vaccinating) to reduce the spectral radius below 
a threshold T —we refer to this as the spectral radius 
minimization (SRM) problem, with variants depending 
on whether edges are removed (the SRME problem) 
or whether nodes are removed (the SRMN problem). 
Van Mieghem et al. [5H] and Tong et al. EH prove 
that this problem is NP-complete. They also study 
two heuristics for it, one based on the components 
of the first eigenvector (EigenScore) and another 
based on degrees (ProductDegree). However, no 
rigorous approximations were known for the SRME or 
the SRMN problems. 

Our main contributions. 

1. Lower bounds on the worst-case performance
of heuristics: We show that the ProductDegree,
EigenScore and Pagerank heuristics (defined formally
in Section 7) can perform quite poorly in general.
We demonstrate graph instances where these heuristics 
give solutions of cost H(^) times the optimal, where n 
is the number of nodes in the graph. 

2. Provable approximation algorithms: We present
two bicriteria approximation algorithms for the SRME
and SRMN problems, with varying approximation
quality and running time tradeoffs. Our first algorithm,
GreedyWalk, is based on hitting closed walks in G.
We show this algorithm has an approximation bound
of $O(\log n \log \Delta)$ times optimal for the cost of edges
removed, while ensuring that the spectral radius becomes
at most $(1 + \epsilon)$ times the threshold, for $\epsilon$ arbitrarily
small (here $\Delta$ denotes the maximum node degree
in the graph). We also design a variant, GreedyWalkSparse,
that performs careful sparsification of
the graph, leading to similar asymptotic guarantees, but
better running time, especially when the threshold T is
small. We then develop algorithm PrimalDual, which
improves this approximation bound to $O(\log n)$ using
a more sophisticated primal-dual approach, at the
expense of a slightly higher (but polynomial) running
time.

3. Extensions: We consider two natural extensions 
of the SRME problem: (i) non-uniform transmission 
rates on edges and (ii) node version SRMN. We show 
that our methods extend to these variations too. 

4. Empirical analysis: We conduct an extensive
experimental evaluation of GreedyWalk, a simplified
version of PrimalDual and different heuristics
that have been proposed for epidemic containment on
a diverse collection of synthetic and real networks.
These heuristics involve picking edges $e = (i, j)$ in
non-increasing order of some kind of score; the specific
heuristics we compare include: (i) ProductDegree,
(ii) EigenScore, (iii) LinePagerank, and (iv) Hybrid,
which picks the edge based on either the eigenscore
or the product-degree ordering, depending on
the maximum decrease in eigenvalue. We find that
GreedyWalk performs better than all the heuristics
in all the networks we study. We analyze GreedyWalk
for walks of length $k = O(\log n)$; in practice, we
found that the performance degrades significantly as k
is reduced.

Organization. The background and notation are defined
in Section 2. Sections 3, 4 and 5 cover the GreedyWalk,
GreedyWalkSparse and PrimalDual algorithms,
respectively, for the SRME problem; the
SRMN problem is discussed in Section 6. Some of the
algorithmic details and proofs are omitted for brevity
and are available in [35]. Lower bounds for some heuristics
and the experimental results are discussed in Sections
7 and 8, respectively. We discuss the related work
in Section 9 and conclude in Section 10.

2 Preliminaries 

We consider undirected graphs $G = (V, E)$, and interventions
to control the spread of epidemics: vaccination
(modeled by removal of nodes) and quarantining
(modeled by removal of edges). There can be different
costs for the removal of nodes and edges (denoted by
$c(v)$ and $c(e)$, respectively), e.g., depending on their demographics,
as estimated by [26]. For a set $E' \subseteq E$,
$c(E') = \sum_{e \in E'} c(e)$ denotes the total cost of the set $E'$
(similarly for node subsets).

There are a number of models for epidemic
spread; we focus on the fundamental SIS (Susceptible-Infectious-Susceptible)
model, which is defined in the
following manner. Nodes are in susceptible (S) or infectious
(I) state. Each infected node u (in state I)
causes each susceptible neighbor v (in state S) to become
infected at rate $\beta_{uv}$. Further, each infected node
u switches to the susceptible state at rate $\delta$. In this paper,
we assume a uniform rate $\beta_{uv} = \beta$ for all $(u, v) \in E$;


Table 1: Notations

Symbol                Meaning
$G = (V, E)$          Graph representing a contact network
$n = |V|$             Total number of nodes in G
$d(v, G)$             Degree of node v in G
$\Delta(G)$           Maximum node degree in G
$A = A_G$             Adjacency matrix of G
$G[E']$               Subgraph of G induced on $E' \subseteq E$
$\lambda_i(G)$        ith largest eigenvalue of A
$\rho(G) = \rho(A)$   $\lambda_1(G)$, spectral radius of G
$c(\cdot)$            Cost of a vertex or edge of G
$\beta$               Infection rate
$\delta$              Recovery rate
$T$                   Epidemic threshold, $T = \delta/\beta$
$\tau$                Time to epidemic extinction
$\mathcal{W}_k(G)$    Set of closed walks of length k in G
$W_k(G)$              $W_k(G) = |\mathcal{W}_k(G)|$
nodes(w)              Number of distinct nodes in walk w
walks(x, G, k)        Number of closed k-walks in G containing edge (or vertex) x
$E_{OPT}(T)$          Optimal solution to SRME$(G, c(\cdot), T)$



in this case, we define a threshold $T = \delta/\beta$, which
characterizes the time to extinction. Let $A = A_G$ denote
the adjacency matrix of G, and let $n = |V|$. Let
$\lambda_i(G)$ denote the ith largest eigenvalue of A, and let
$\rho(A) = \lambda_1(A)$ denote the spectral radius of A. Since G
is undirected, it follows that all eigenvalues are real, and
$\rho(A) > 0$ (see, e.g., Chapter 3 of [21]). Ganesh et al. [14]
showed that the epidemic dies out in time logarithmic in n
if $\rho(A) < T$ in the SIS model, with high probability;
this threshold was also observed by EH]. Prakash et al. 
m show this condition holds for a broad class of other 
epidemic models, including the SIR model (which contains
the ‘Recovered’ state). Now we formally define the
SRM problem.

Definition 2.1. Spectral Radius Minimization
problems (SRME and SRMN): Given an undirected
graph $G = (V, E)$, with cost $c(e)$ for each edge e, and
a threshold T, the goal of the SRME$(G, c(\cdot), T)$ problem
is to find the cheapest subset $E' \subseteq E$ such that
$\lambda_1(G[E \setminus E']) < T$. We refer to the node version of this
problem as SRMN$(G, c(\cdot), T)$.

We discuss some notation that will be used in the rest of
the paper. $E_{OPT}(T)$ denotes an optimal solution to the
SRME$(G, c(\cdot), T)$ problem. Let $\mathcal{W}_k(G)$ denote the set
of closed walks of length k in G; let $W_k(G) = |\mathcal{W}_k(G)|$.
For a walk w, let nodes(w) denote the number of distinct
nodes in w. A standard result (see, e.g., Chapter 3 of
[21]) is the following:

(2.1)    $\sum_{w \in \mathcal{W}_k(G)} \mathrm{nodes}(w) = \sum_{i=1}^{n} (A^k)_{ii} = \sum_{i=1}^{n} \lambda_i(G)^k$

The number of walks in $\mathcal{W}_k(G)$ containing a node i is $(A^k)_{ii}$.

For a graph G, let walks(e, G, k) denote the number
of closed k-walks in G containing $e = (i, j)$. Then,
walks$(e, G, k) = (A^{k-1})_{ij}$. We say that an edge set $E'$ hits
a walk w if w contains an edge from $E'$. Similarly, for
a node v, let walks(v, G, k) denote the number of closed
k-walks in G containing v. Then, walks$(v, G, k) = (A^k)_{vv}$.
Table 1 summarizes the frequently used notations.
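
The identity (2.1) and the matrix-power expressions for the walk counts can be checked numerically on a small graph. The following is a minimal sketch, assuming NumPy and a dense adjacency matrix; the function names are illustrative and do not come from the paper.

```python
import numpy as np


def closed_walk_trace(A, k):
    """Return sum_i (A^k)_ii, the middle term of identity (2.1)."""
    return float(np.trace(np.linalg.matrix_power(A, k)))


def walks_through_edge(A, i, j, k):
    """Return (A^{k-1})_ij, the quantity used for walks(e, G, k), e = (i, j)."""
    return float(np.linalg.matrix_power(A, k - 1)[i, j])


def walks_through_node(A, v, k):
    """Return (A^k)_vv, the quantity used for walks(v, G, k)."""
    return float(np.linalg.matrix_power(A, k)[v, v])


# Example graph (an assumption for illustration): the 4-cycle 0-1-2-3-0.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
k = 4
eigenvalues = np.linalg.eigvalsh(A)          # real, since A is symmetric
trace_k = closed_walk_trace(A, k)
eig_power_sum = float(np.sum(eigenvalues ** k))
```

Here `trace_k` and `eig_power_sum` agree up to floating-point error, which is the second equality in (2.1).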

3 GreedyWalk: $O(\log n \log \Delta)$-approximation

Main idea. Our starting point is the connection
between the number of closed walks in a graph and
the sum of powers of the eigenvalues in (2.1). We try
to reduce the spectral radius by reducing the number
of closed walks of length k in the graph, by removing
edges (see Algorithm 1). This, in turn, can be viewed
as a partial covering problem (see footnote 1). Our basic idea extends
to other versions, as discussed later in Section 6.

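Algorithm 1 GreedyWalk (high level description)
Input: $G$, $T$, $c(\cdot)$, $k$ even
Output: Edge set $E'$

1: Initialize $E' \leftarrow \emptyset$
2: while $W_k(G[E \setminus E']) > nT^k$ do
3:     $r \leftarrow W_k(G[E \setminus E']) - nT^k$
4:     Pick $e \in E \setminus E'$ that maximizes $\min\{r, \mathrm{walks}(e, G[E \setminus E'], k)\} / c(e)$
5:     $E' \leftarrow E' \cup \{e\}$
6: end while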
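
The loop of Algorithm 1 can be sketched in a few lines for small dense graphs. This is only an illustration under stated assumptions: it uses trace(A^k) as the walk count (so each closed walk is counted once per starting node), it assumes the capped ratio min{r, walks}/c(e) as the selection rule, and the function name and the `cost` dictionary keyed by `(i, j)` with `i < j` are hypothetical.

```python
import numpy as np


def greedy_walk(A, T, k, cost=None):
    """Greedily remove edges until trace(A^k) <= n * T^k.

    A    : symmetric 0/1 adjacency matrix (array-like)
    T    : threshold on the spectral radius
    k    : even walk length
    cost : optional dict mapping (i, j) with i < j to an edge cost (default 1)
    Returns (removed_edges, residual_adjacency_matrix).
    """
    A = np.array(A, dtype=float)              # work on a copy
    n = A.shape[0]
    target = n * float(T) ** k
    removed = []
    while True:
        total = np.trace(np.linalg.matrix_power(A, k))
        if total <= target:
            break
        r = total - target                    # walks that still must be hit
        P = np.linalg.matrix_power(A, k - 1)  # P[i, j] ~ walks(e, G, k)
        best_edge, best_score = None, -1.0
        rows, cols = np.nonzero(np.triu(A, 1))
        for i, j in zip(rows.tolist(), cols.tolist()):
            c = 1.0 if cost is None else cost[(i, j)]
            score = min(r, P[i, j]) / c
            if score > best_score:
                best_edge, best_score = (i, j), score
        i, j = best_edge
        A[i, j] = A[j, i] = 0.0               # remove the chosen edge
        removed.append(best_edge)
    return removed, A
```

Because k is even, the stopping condition gives $\lambda_1^k \le \mathrm{trace}(A^k) \le nT^k$ for the residual graph, so its spectral radius is at most $n^{1/k} T$ in this variant.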


The lemma below proves the approximation bound
for any solution (say $E'$) from GreedyWalk. Let
$G' = G[E \setminus E']$ denote the graph resulting after the
removal of edges in $E'$. Our proof involves three steps:
(1) Proving the bound on $\lambda_1(G')$; (2) Relating $c(E')$ to
the cost of the optimum solution to the partial covering
problem which ensures that the number of walks in the
residual graph is at most $nT^k$; (3) Showing that the
optimum solution to the SRME problem also ensures
that at most $nT^k$ walks remain in the residual graph.

Lemma 3.1. Let $E'$ denote the set of edges found by
Algorithm GreedyWalk. Given any constant $\epsilon > 0$,
let k be an even integer larger than $\log n / \log(1 + \epsilon/3)$. Then,
we have $\lambda_1(G[E \setminus E']) \le (1 + \epsilon)T$, and $c(E') =
O(c(E_{OPT}(T)) \log n \log \Delta)$.

Proof. We follow the proof scheme mentioned above.
By the stopping condition of the algorithm, we have
$W_k(G') \le nT^k$. From (2.1), we have
$\sum_{i=1}^{n} \lambda_i(G')^k = \sum_{i=1}^{n} (A_{G'}^k)_{ii} = \sum_{w \in \mathcal{W}_k(G')} \mathrm{nodes}(w) \le k W_k(G')$,
which implies $\sum_{i=1}^{n} \lambda_i(G')^k \le nkT^k$. Further, since k is even
(by assumption), each term $\lambda_i(G')^k \ge 0$, so that
$\lambda_1(G')^k \le \sum_{i=1}^{n} \lambda_i(G')^k \le nkT^k$. This implies
$\lambda_1(G') \le e^{(\log n + \log k)/k} T$. Since $k \ge \log n / \log(1 + \epsilon/3)$, we have
$(\log n + \log k)/k \le 2 \log(1 + \epsilon/3)$, so that
$\lambda_1(G') \le (1 + \epsilon/3)^2 T \le (1 + \epsilon)T$.

Footnote 1: This is a variation of the set cover problem, in which an
instance consists of (i) a set H of elements, (ii) a collection
$\mathcal{S} = \{S_1, \ldots, S_m\} \subseteq 2^H$ of sets, (iii) cost$(S_j)$ for each $S_j \in \mathcal{S}$,
and (iv) a parameter $r \le |H|$. The objective is to find the
cheapest collection of sets from $\mathcal{S}$ which cover at least r elements.
Slavík [36] shows that a greedy algorithm gives an $O(\log |H|)$
approximation.

Next, we derive a bound for $c(E')$. Observe that
the algorithm can be viewed as solving a partial cover
problem, in which (i) the set H of elements corresponds
to walks in $\mathcal{W}_k(G)$, and (ii) there is a set corresponding
to each edge $e \in E$ consisting of all the
walks in $\mathcal{W}_k(G)$ that contain e. Following the analysis
of the greedy algorithm for partial cover [36], we
have $c(E') = O(c(E_{HITOPT}) \log |H|)$, where $E_{HITOPT}$
denotes the optimum solution for this covering instance.
Since $\Delta$ denotes the maximum node degree,
we have $|H| = W_k(G) \le n\Delta^k$, so that
$\log |H| \le \log n + k \log \Delta = O(\log n \log \Delta)$ for $k = O(\log n)$.
We show below that
$c(E_{HITOPT}) \le c(E_{OPT}(T))$; it follows that $c(E') =
O(c(E_{OPT}(T)) \log n \log \Delta)$.

Finally, we prove that $c(E_{HITOPT}) \le c(E_{OPT}(T))$.
By definition of $E_{OPT}(T)$, we have $\lambda_1(G[E - E_{OPT}(T)]) < T$.
Let $G'' = G[E - E_{OPT}(T)]$. Then,
we have

$W_k(G'') \le \sum_{i=1}^{n} \lambda_i(G'')^k \le n\lambda_1(G'')^k \le nT^k.$

This implies $E_{OPT}(T)$ hits at least $W_k(G) - nT^k$ walks,
so that $c(E_{HITOPT}) \le c(E_{OPT}(T))$.

Effect of the walk length k. We set the walk length
$k = a \log n$ for some constant a in Algorithm GreedyWalk;
understanding the effect of k is a natural question.
From the proof of Lemma 3.1, it follows that
$\lambda_1(G[E \setminus E'])$ can be bounded by $(nk)^{1/k} T$ for any choice
of k, as long as it is even. This bound becomes worse
as k becomes smaller, e.g., the factor $(nk)^{1/k}$ is $O(\sqrt{n})$ for k = 2. This
is borne out in the experiments in Section 8.
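
To get a feel for how quickly the factor $(nk)^{1/k}$ shrinks with k, it can simply be tabulated. The snippet below is an illustrative calculation only; the value of n and the set of walk lengths are arbitrary choices, not taken from the experiments.

```python
import math


def radius_bound_factor(n, k):
    """Factor (n*k)^(1/k) multiplying T in the bound on lambda_1."""
    return (n * k) ** (1.0 / k)


n = 10_000                                   # hypothetical network size
walk_lengths = (2, 4, 8, 16, 2 * math.ceil(math.log(n)))
factors = {k: radius_bound_factor(n, k) for k in walk_lengths}

for k, f in factors.items():
    print(f"k = {k:2d}: lambda_1 <= {f:.2f} * T")
```

For k = 2 the factor equals $\sqrt{2n}$, while for k on the order of log n it is a small constant.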

In order to complete the description of GreedyWalk
(Algorithm 1), we need to design an efficient
method to determine the edge which maximizes the
quantity in line 4. We discuss two methods below.

3.1 Matrix multiplication approach for implementing
GreedyWalk. Note that walks$(e, G, k) = (A^{k-1})_{ij}$ for $e = (i, j)$.
We use matrix multiplication to compute $A^{k-1}$
once for each iteration of the while loop in line 2 of
Algorithm 1. In line 4, we iterate over all edges, in order
to compute the edge e that maximizes the given
ratio. For $k = O(\log n)$, $A^{k-1}$ can be computed in
time $O(n^{\omega} \log \log n)$ by repeated squaring, where $\omega \approx 2.37$ is the exponent
for the running time of the best matrix multiplication
algorithm [40]. Therefore, each iteration involves
$O(n^{\omega} \log \log n + m) = O(n^{\omega} \log \log n)$ time. This gives a
total running time of $O(n^{\omega} \log \log n \, |E_{OPT}| \log^2 n)$, since
only $O(|E_{OPT}| \log^2 n)$ edges are removed. One drawback
with this approach is the high (super-linear) space
complexity, even with the best matrix multiplication
methods, in general.

3.2 Dynamic programming approach for implementing
GreedyWalk. When the graphs are very
sparse ($O(n)$ edges), we adapt a dynamic programming
approach to compute walks(e, G, k) for an edge
e and more efficiently select the edge that maximizes
walks$(e, G[E \setminus E'], k)/c(e)$ in line 4 of Algorithm 1. Although
walks(e, G, k) potentially needs to be computed
for each edge $e \in E \setminus E'$, in practice it suffices to compute
it for only a small subset of $E \setminus E'$. We make use of the
fact that walks$(e, G', k) \le$ walks$(e, G, k)$ for any subgraph
$G'$. The approach is briefly as follows. Initially
we compute walks(e, G, k) for each $e \in E$ and arrange
the edges in non-ascending order of their walks(e, G, k)
value, $e_1, e_2, \ldots, e_{|E|}$. After the first edge (i.e., $e_1$ in the
first iteration) is removed, walks$(e, G', k)$ is computed
on the residual graph $G'$ only for some consecutive edges
in that order up to some index i such that walks$(e_i, G', k) \ge$
walks$(e_{i+1}, G, k)$. Edges $e_2, \ldots, e_i$ are reordered based on
the recomputed walk numbers, walks$(e_i, G', k)$, and then
the same steps are repeated. The approach takes $O(n)$
space and $O(n^2 k)$ time, assuming the number of edges
is $O(n)$, as in large real-world networks. The detailed algorithm
and the analysis are given in Appendix A.1.
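
The lazy re-evaluation idea can be sketched with a max-heap of possibly stale walk counts. This is a heap-based variant written for illustration, not the procedure of Appendix A.1: it assumes unit edge costs, an adjacency-set dictionary with comparable node labels, and a fixed number of edges to remove; all names are hypothetical.

```python
import heapq
from collections import defaultdict


def walks_through_edge(adj, i, j, k):
    """Count walks of length k-1 from i to j, i.e. (A^{k-1})_ij, by dynamic
    programming over a sparse vector of walk counts."""
    counts = {i: 1}
    for _ in range(k - 1):
        nxt = defaultdict(int)
        for u, c in counts.items():
            for v in adj[u]:
                nxt[v] += c
        counts = nxt
    return counts.get(j, 0)


def lazy_greedy_edges(adj, k, num_edges):
    """Remove num_edges edges, each time taking an edge with the largest
    current walk count, recomputing stale counts only when needed."""
    adj = {u: set(vs) for u, vs in adj.items()}      # work on a copy
    heap = [(-walks_through_edge(adj, u, v, k), u, v)
            for u in adj for v in adj[u] if u < v]
    heapq.heapify(heap)
    removed = []
    while heap and len(removed) < num_edges:
        _, u, v = heapq.heappop(heap)
        fresh = walks_through_edge(adj, u, v, k)
        # Counts never increase when edges are removed, so stale keys are
        # upper bounds: if the fresh count beats the next key, (u, v) is a
        # true maximizer in the current residual graph.
        if not heap or fresh >= -heap[0][0]:
            adj[u].discard(v)
            adj[v].discard(u)
            removed.append((u, v))
        else:
            heapq.heappush(heap, (-fresh, u, v))
    return removed
```

The design relies only on the monotonicity walks$(e, G', k) \le$ walks$(e, G, k)$ stated above; each recomputation touches at most k sparse vectors, so the extra memory stays linear in the graph size.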

4 Using sparsification for faster running time: 

Algorithm GreedyWalkSparse 

The efficiency of Algorithm GreedyWalk can be improved
if the number of edges in the graph can be reduced.
This can be achieved by two pruning steps:
pruning edges such that in the residual graph (i) no
node has degree more than $T^2$, and (ii) there is no T-core;
the T-core of a graph denotes the maximal subgraph
of G with minimum degree T (see, e.g., [?]).
We will refer to these steps as MaxDegreeReduction
and DensityReduction, respectively. This leads
to sparser graphs, without affecting the asymptotic approximation
guarantees; the procedure is described in Algorithm
GreedyWalkSparse.

Lemma 4.1. Let $E_1$ and $E_2$ denote the set of edges
removed in the pruning steps MaxDegreeReduction
and DensityReduction, respectively. Then, $c(E_1)$
and $c(E_2)$ are both at most $2c(E_{OPT}(T))$.

Algorithm 2 Algorithm GreedyWalkSparse
Input: $G$, $T$, $c(\cdot)$
Output: Edge set $E'$

1: Initialize $G_r = G$.
2: // Pruning step 1: MaxDegreeReduction
3: Let $V_{T^2} = \{v : d(v, G) \ge T^2\}$.
4: for $v \in V_{T^2}$ do
5:     if $d(v, G_r) \ge T^2$ then
6:         Let $e_{v,1}, \ldots, e_{v,d(v,G_r)}$ be the edges incident on
           v ordered so that $c(e_{v,1}) \le \ldots \le c(e_{v,d(v,G_r)})$.
7:         Let $E_v = \{e_{v,1}, \ldots, e_{v,d(v,G_r)-T^2+1}\}$.
8:         $E_1 \leftarrow E_1 \cup E_v$ and $E(G_r) \leftarrow E(G_r) \setminus E_v$.
9:     end if
10: end for
11: // Pruning step 2: DensityReduction
12: Let $G_T$ denote the T-core of $G_r$.
13: Order the edges $e_1, \ldots, e_{|E(G_T)|}$ of $G_T$ in non-decreasing
    order of cost.
14: $E_2 \leftarrow \{e_i : i \le |E(G_T)| - T|V(G_T)|/2 + 1\}$
15: // GreedyWalk on pruned graph
16: $E(G_r) \leftarrow E(G_r) - E_1 - E_2$
17: $E_3 = $ GreedyWalk$(G_r, T, c(\cdot))$
18: $E' \leftarrow E_1 \cup E_2 \cup E_3$

Proof. We have $\sqrt{\Delta(G)} \le \lambda_1(G)$ [21], which implies
$\Delta(G[E - E_{OPT}(T)]) < T^2$. Therefore, for each node v,
$c(\{e \in N(v) \cap E_{OPT}(T)\}) \ge \sum_{j=1}^{d(v,G)-T^2+1} c(e_{v,j})$, where the sum
is the minimum cost of edges that can be removed
to ensure that the degree of v becomes at most $T^2$.
Therefore,

$c(E_1) = \sum_{v \in V_{T^2}} \sum_{j=1}^{d(v,G)-T^2+1} c(e_{v,j}) \le \sum_{v \in V_{T^2}} c(\{e \in N(v) \cap E_{OPT}(T)\}) \le 2c(E_{OPT}(T)),$

where the last inequality holds because each edge of $E_{OPT}(T)$ is
counted at most twice, once for each endpoint.

Recall that the second pruning step is applied on $G_r$.
For bounding $c(E_2)$, we use another lower bound for $\lambda_1$:
for any induced subgraph H of $G_r$, $\sum_{v \in V(H)} d(v, H) / |V(H)| \le \lambda_1(G_r)$.
Therefore, the existence of a T-core $G_T$ implies
that $\lambda_1(G_r) \ge T$. Since the average degree of $G_T$ must
drop below T in the residual graph, at least
$|E(G_T)| - T|V(G_T)|/2 + 1$ edges must be removed from
$G_T$. Therefore,

$c(E_2) = \sum_{j=1}^{|E(G_T)| - T|V(G_T)|/2 + 1} c(e_j) \le c(E_{OPT}(T) \cap E(G_T)),$

where the $e_j$ correspond to the first $|E(G_T)| - T|V(G_T)|/2 + 1$
edges of least cost. Hence proved.
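
The two graph operations behind the pruning steps can be sketched as follows. This is an illustrative sketch under assumptions, not the full Algorithm 2: it shows the degree cap of MaxDegreeReduction and a standard peeling computation of the T-core, but omits the selection of the cheapest edges inside the core. The adjacency-set dictionary, the `cost` dictionary keyed by `frozenset((u, v))`, and the function names are hypothetical.

```python
def max_degree_reduction(adj, cost, T):
    """For every node whose degree is at least T^2, drop its cheapest
    incident edges until its degree falls below T^2.

    Returns (pruned_adjacency, set_of_removed_edges)."""
    g = {u: set(vs) for u, vs in adj.items()}        # work on a copy
    removed = set()
    high_degree = [v for v in adj if len(adj[v]) >= T * T]
    for v in high_degree:
        while g[v] and len(g[v]) >= T * T:
            u = min(g[v], key=lambda w: cost[frozenset((v, w))])
            g[v].discard(u)
            g[u].discard(v)
            removed.add(frozenset((v, u)))
    return g, removed


def t_core(adj, T):
    """Return the maximal subgraph in which every node has degree >= T,
    obtained by repeatedly peeling nodes of smaller degree."""
    core = {u: set(vs) for u, vs in adj.items()}
    queue = [u for u in core if len(core[u]) < T]
    while queue:
        u = queue.pop()
        if u not in core:
            continue                                 # already peeled
        for v in core.pop(u):
            core[v].discard(u)
            if len(core[v]) < T:
                queue.append(v)
    return core
```

If `t_core` returns an empty dictionary, the graph has no T-core and the second pruning step has nothing to do.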

By Lemma 4.1, it follows that the approximation
bounds of Lemma 3.1 still hold. However, the pruning
steps reduce the number of edges, thereby speeding
up the implementation of GreedyWalk. We discuss the
empirical performance of pruning in Section 8. We show
below that pruning also improves the approximation
factor marginally from $O(\log n \log \Delta)$ to $O(\log n \log T)$,
which could be significant when n is large and $T \ll \Delta$.

Lemma 4.2. Let $E'$ denote the set of edges found by
Algorithm GreedyWalkSparse. Given any constant
$\epsilon > 0$, let k be an even integer larger than $\log n / \log(1 + \epsilon/3)$.
Then, we have $\lambda_1(G[E \setminus E']) \le (1 + \epsilon)T$, and $c(E') =
O(c(E_{OPT}(T)) \log n \log T)$.

Proof. From Lemma 4.1, the cost of the edges removed
in the pruning steps is $O(c(E_{OPT}(T)))$. The residual graph $G_r$ has
maximum degree less than $T^2$. Therefore, applying
Lemma 3.1 on $G_r$, it follows that the cost of the edges
removed by GreedyWalk is $O(c(E_{OPT}(T)) \log n \log T)$. Hence, the total
cost of the edges removed by GreedyWalkSparse is
at most $O(c(E_{OPT}(T))) + O(c(E_{OPT}(T)) \log n \log T) =
O(c(E_{OPT}(T)) \log n \log T)$.

5 PrimalDual: $O(\log n)$-approximation

Main idea: The approach of [13] gives an f-approximation
for the partial covering problem, where
f denotes the maximum number of sets that contain any
element in the set system. As in the proof of Lemma
3.1, in our reduction from the SRME problem to partial
covering, elements correspond to all the closed walks of
length $k = O(\log n)$, while sets correspond to edges; for
an edge e, the corresponding set $S_e$ consists of all the
walks w that are hit by e. In this reduction, each walk w
lies in at most k sets; therefore, $f = O(\log n)$ for this set system.
Therefore, the approach of [13] could improve the approximation
factor. Unfortunately, our set system can have
as many as $n\Delta^k$ elements, which need not be polynomial
in n, so that the algorithm of [13] cannot be
used directly to get a polynomial time algorithm.

The algorithm of Gandhi et al. [13] uses a primal-dual
approach, which maintains dual variables $u(w)$ for
each element (i.e., walk); these are increased gradually,
and a set (i.e., an edge) is picked if the sum of duals
corresponding to the elements in the set equals its cost.
We now discuss how to adapt this algorithm to run in
polynomial time, and only focus on polynomial time
implementation of the PrimalDual subroutine of [13]
in detail here. However, we also present the set cover
algorithm HitWalks for completeness. This algorithm
iterates over all edges and invokes PrimalDual in each
iteration to obtain a candidate set of edges to remove
and finally chooses the set with minimum cost. $\mathcal{T}'$, $\mathcal{S}'$, $c'$
and $\sigma'$ denote the set of elements (walks) to be covered,
the sets (corresponding to edges that can be chosen), the
costs corresponding to the sets/edges and the number of
elements (walks) that need to be covered, respectively.
A subset $C \subseteq \mathcal{S}'$ is $\sigma'$-feasible if $|\bigcup_{S_e \in C} S_e| \ge \sigma'$.

Let $u(w)$ denote the dual variables corresponding to
the walks w; these are not maintained in the algorithm
explicitly, but assigned in the comments, for use in the
analysis. This algorithm does not explicitly update the
dual variables, but the edges are picked in the same
sequence as in [13].

Algorithm 3 PrimalDual($\mathcal{T}'$, $\mathcal{S}'$, $c'$, $\sigma'$)
Output: Edge set $E''$

1: Initialize $z_e = 0$ for all $S_e \in \mathcal{S}'$, $C \leftarrow \emptyset$.
2: // $u(w) = 0$ for all walks w in $G'$.
3: while C is not $\sigma'$-feasible do
4:     $x = \min_{e \in E \setminus E''} (c_e - z_e) / \mathrm{walks}(e, G', k)$; let e be an edge for
       which the minimum is reached.
5:     $C \leftarrow C \cup \{S_e\}$
6:     For each $e' \in E \setminus E''$: $z_{e'} = z_{e'} + x \cdot \mathrm{walks}(e', G', k)$
7:     // $u(w) = u(w) + x$ for all walks w in $G'$ that
       pass through $e'$
8:     $E'' \leftarrow E'' \cup \{e\}$
9: end while

Algorithm 4 HitWalks($\mathcal{T}$, $\mathcal{S}$, $c$, $\sigma$)
Input: Set of all closed k-walks $\mathcal{T}$, walks corresponding to
edges $\mathcal{S}$, edge cost set c, number of walks to hit $\sigma$
Output: Edge set $E'$

1: Sort the edges of G in increasing order of their costs.
2: Initialize $\forall j$, $c'(e_j) \leftarrow \infty$
3: for $j \leftarrow 1$ to m do
4:     $c'(e_j) \leftarrow c(e_j)$ and compute walks$(e_j, G, k)$
5:     $cs_j \leftarrow \infty$ // cost of edge set in this iteration
6:     if $|S_1 \cup S_2 \cup \cdots \cup S_j| \ge \sigma$ then
7:         $E'_j = \{e_j\} \cup$ PrimalDual($\mathcal{T} \setminus S_j$, $\mathcal{S} \setminus S_j$, $c'$, $\sigma - \mathrm{walks}(e_j, G, k)$)
8:         $cs_j = c(E'_j)$
9:     end if
10:    $i = \arg\min_j cs_j$
11:    $E' = E'_i$
12: end for

Lemma 5.1. Given any constant $\epsilon > 0$, let k be an even
integer larger than $\log n / \log(1 + \epsilon/3)$. The dual variables $u(w)$
in algorithm PrimalDual are maintained and updated
as in [13], and the edge e picked in each iteration is
the same. We have $c(E') = O(c(E_{OPT}) \log n)$ and
$\lambda_1(G[E \setminus E']) \le (1 + \epsilon)T$.

Proof. Instead of updating the dual variable $u(w)$ for
each element (walk) w, as done in [13], the variable
$z_e$ corresponding to each set (edge) e is updated in
algorithm PrimalDual at the end of each iteration.
It is easy to see that the following is an invariant at
the end of each iteration: $z_e = \sum_{w \in S_e} u(w)$. Also note
that a set e is picked into the cover in PrimalDual
whenever $z_e = c_e$.

Therefore, increasing the $u(w)$'s has the same effect
as increasing the $z_e$'s in terms of picking the sets into
the cover, and both the algorithm PrimalDual and the
one in [13] choose the same set in each iteration.

6 Node Version 

Our discussion so far has focused on the SRME prob¬ 
lem. We now consider extensions which capture two 
kinds of issues arising in practice. 

1. Non-uniform transmission rates. In general, 
the transmission rate $\beta$ is not constant for all the edges.
The transmission rate $\beta_{ij}$ for edge $(i, j)$ depends on
individual properties, especially the demographics of
the end-points i and j, such as age, e.g., |5^. Let 
$B = B(G) = (\beta_{ij})$ denote the matrix of the transmission
rates. This gives us the SRME-NONUNIFORM problem,
which is defined as follows: Given an undirected graph
$G = (V, E)$, with transmission rate $\beta_{ij}$ for each $(i, j) \in E$
and recovery rate $\delta$, find the smallest set $E' \subseteq E$
such that $\rho(B(G[E - E'])) < \delta$. We extend the
spectral radius characterization of [iiiMiini to handle 
this setting, and show that GreedyWalk can also 
be adapted for solving SRME-NONUNIFORM, with the 
same guarantees. The details of the algorithm, lemma
and proofs are discussed in Appendix A.2.

2. The node removal version (SRMN problem).
We extend the GreedyWalk algorithm in a
natural manner to work for SRMN, with the same approximation
guarantees. For the details, please see
Appendix A.3.

7 Popular heuristics and lower bounds 

A number of heuristics have been developed for controlling
the spread of epidemics; these are discussed below.
All these heuristics involve ordering the edges based on
some kind of score, and then selecting the top few edges
based on this score. We describe the score function in
each heuristic.

1. ProductDegree ([28]): The score for edge $e = (u, v)$
is defined as $\deg(u) \times \deg(v)$. Edges are
removed in non-increasing order of this score.

2. EigenScore ([28], [37]): Let x be the eigenvector
corresponding to the first eigenvalue of the graph.
The score for edge $e = (u, v)$ is $|x(u) \times x(v)|$.

3. LinePagerank: This method uses the linegraph
$L(G) = (E, F)$ of graph $G = (V, E)$, where $(e, e') \in F$
if $e, e' \in E$ have a common endpoint. We define
the score of edge $e \in E$ as the pagerank of the
corresponding node in $L(G)$.
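
The first two score functions are easy to compute for a small graph. The sketch below assumes NumPy and a dense symmetric adjacency matrix; the helper names are illustrative, and LinePagerank is omitted because it needs a PageRank routine on the line graph.

```python
import numpy as np


def _edges(A):
    """List each undirected edge once as (u, v) with u < v."""
    rows, cols = np.nonzero(np.triu(A, 1))
    return list(zip(rows.tolist(), cols.tolist()))


def product_degree_scores(A):
    """Score of e = (u, v) is deg(u) * deg(v)."""
    deg = A.sum(axis=1)
    return {(u, v): float(deg[u] * deg[v]) for u, v in _edges(A)}


def eigen_scores(A):
    """Score of e = (u, v) is |x(u) * x(v)|, x being the top eigenvector."""
    _, vecs = np.linalg.eigh(A)       # eigenvalues returned in ascending order
    x = vecs[:, -1]                   # eigenvector of the largest eigenvalue
    return {(u, v): float(abs(x[u] * x[v])) for u, v in _edges(A)}


def ranked(scores):
    """Edges in non-increasing order of score."""
    return sorted(scores, key=scores.get, reverse=True)


# Example graph (an assumption for illustration): the path 0-1-2-3.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
pd_scores = product_degree_scores(A)
es_scores = eigen_scores(A)
```

On this path both heuristics rank the middle edge (1, 2) first, since it joins the two highest-degree nodes and the two largest eigenvector entries.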

As we find in Section 8.2, these heuristics work
well for different kinds of networks. We design another
heuristic, Hybrid, which picks the best of the EigenScore
and ProductDegree methods. The edges are
ordered in the following manner: (1) Let $\pi_1, \ldots, \pi_m$ and
$\mu_1, \ldots, \mu_m$ be orderings of edges in the EigenScore
and ProductDegree algorithms, respectively. (2) Initialize
i = 0 and j = 0, and (3) from the edges $\pi(i)$ and
$\mu(j)$, remove the one which decreases the max eigenvalue
of the residual graph more. Increment the corresponding
index.

We have examined the worst case performance of 
these heuristics. Two of these, namely, EigenScore 
and ProductDegree, have been used specifically for 
reducing the spectral radius, e.g., |2i[32j. No formal 
analysis is known for any of these heuristics in the 
context of the SRME or SRMN problems; some of 
them seem to work pretty well on real world networks. 
We show that the worst case performance of these 
heuristics can be quite poor, in general. 

Theorem 7.1. Given any sufficiently large positive integer
n, there exists a threshold $T' < a\sqrt{n}$, for
some constant $a < 1$ and a graph of size n for
which the number of edges removed by ProductDegree,
EigenScore, Hybrid and LinePagerank is

^{^)c(Eop't). 

The proof is presented in Appendix A.4.

8 Experiments

8.1 Methods and Dataset. We evaluate the algorithms
developed in the paper (GreedyWalk,
GreedyWalkSparse and PrimalDual; see footnote 2 for the code)
and compare their performance with the heuristics from the
literature (EigenScore, ProductDegree, LinePagerank
and Hybrid, described in Section 7), the last serving as a more
sophisticated baseline. The networks which we considered
in our empirical analysis are listed in Table 2,
spanning infrastructure networks, social networks and
random graphs.

8.2 Experimental results 

Performance of our algorithms and comparison
with other heuristics: We first compare the quality
of solution from our algorithms with the EigenScore,
ProductDegree, LinePagerank and Hybrid
heuristics in Figure 1. We note that GreedyWalk
is consistently better than all other heuristics,
especially as the target threshold becomes smaller.
Compared to the EigenScore, ProductDegree and
LinePagerank heuristics, the spectral radius for the
solution produced by GreedyWalk, as a function of
the fraction of edges removed, is lower by at least 10-20%.
Our improved baseline, the Hybrid heuristic,
works better than the other heuristics, and comes somewhat
close to GreedyWalk in many networks.

Footnote 2: All code at: http://tmyurl.com/131gsq7.

Table 2: Networks and their sizes. The first two are
synthetic random networks; others are taken from [2] and [?].

Network                 nodes      edges      $\lambda_1$
Barabasi-Albert         1000       1996       11.1
Erdos-Renyi             994        2526       6.38
P2P (Gnutella05)        8846       31839      23.55
P2P (Gnutella06)        8717       31525      22.38
Collab. Net (HepTh)     9877       25998      31.03
Collab. Net (GrQc)      5242       14496      45.62
AS (Oregon 1)           10670      22002      58.72
AS (Oregon 2)           10900      31180      70.74
Brightkite Net          58228      214078     101.49
Youtube Network         1134890    2987624    210.4
Stanford Web graph      281903     1992636    448.13


Though PrimalDual gives a significantly better
approximation guarantee compared to GreedyWalk,
it has a much higher running time. Therefore, we only
evaluate it for one iteration of Algorithm HitWalks.
Figure 2 shows that PrimalDual is quite close to
GreedyWalk after just one iteration; we expect that running
this algorithm fully would further improve the performance,
but additional work is needed to improve the
running time.



Figure 2: GreedyWalk vs PrimalDual. Panels: (a) Collaboration
GrQc, (b) P2P Gnutella-5. Each
plot shows the spectral radius (y-axis) as a function
of the number of edges removed (x-axis) using the two
methods.


Running time and effect of sparsification: Figure 3
shows the total running time of GreedyWalk for
three networks. The time decreases with the increase of
T, because the while loop in Algorithm GreedyWalk
needs to be run for fewer iterations. The high running
time motivates faster methods. We evaluate the performance
of the GreedyWalkSparse algorithm. As
shown in Figure 4, GreedyWalkSparse gives almost
the same quality of approximation as GreedyWalk,
but improves the running time by up to an order of
magnitude, particularly when T is small.

Figure 3: Total running time of GreedyWalk method
(y-axis) as a function of $T/\rho(G)$ (x-axis), where T is the
threshold and $\rho(G)$ is the spectral radius of the initial
graph, without any edges removed. Legend entries recovered:
Barabasi-Albert, Collab GrQc.



Figure 4: Impact of sparsification on GreedyWalk.
The plots show for AS Oregon-1 network, (a) the
number of edges removed and (b) the execution time
on the y-axis, as a function of $T/\rho(G)$ (x-axis), where
T is the threshold and $\rho(G)$ is the spectral radius of the
initial graph, without any edges removed.


Effect of varying walk lengths: As discussed in Section 3,
the walk length parameter k is critical for the
performance of GreedyWalk. Figure 5 shows the approximation
quality in the Oregon-2 and collaboration
networks. We find that as k becomes smaller, the approximation
quality degrades significantly, and the best
performance occurs at k close to 2 log n.

Extensions: For the SRME-NONUNIFORM problem,
we compare the adaptation of GreedyWalk, as discussed
in Section 6, with the EigenScore heuristic run
on the matrix B of transmission rates. As shown in
Figure 6(b), we find that GreedyWalk performs much better.
Next we consider the SRMN problem, and compare
GreedyWalk, as adapted in Section 6, with the
node versions of the Degree and EigenScore heuristics [?].
As shown in Figure 6(d), GreedyWalk performs
consistently better. For results in other networks,
see the full version [35].

Figure 1: Comparison between the GreedyWalk, ProductDegree, EigenScore, LinePagerank and
Hybrid algorithms for different networks. Panels recovered: (a) AS Oregon-1, (d) P2P Gnutella-5,
(e) P2P Gnutella-6, (f) Brightkite, (g) Portland, (h) Youtube, (i) Stanford Web. Each plot shows the
spectral radius (y-axis) as a function of the fraction of edges removed (x-axis). The LinePagerank
heuristic has not been evaluated in 1(g), 1(h) and 1(i) because of the scale of these networks.

Demographic properties of removed nodes and
edges: GreedyWalk can also help in getting
non-network surrogates for picking nodes/edges. We
analyzed the demographic properties of the nodes and edges
removed by GreedyWalk on the Portland contact
network [1]. By doing so, we can hope to use such
demographic properties directly, for quicker implementation
and/or when the entire network is not readily available.
Figure 7 shows the age groups of the end points of the
top 1500 edges selected by GreedyWalk as a matrix.
Age-groups are partitioned according to [26] and shown
in Table 3. As the figure shows, the edges among
age-group #11 (ages 45-49) and with age-groups #8 (ages
30-34) and #17 (ages 75+) are picked to a greater extent
by GreedyWalk. We observe that the edges picked
by GreedyWalk have substantially different properties
compared to other heuristics. Figure 8 shows the
age groups of the nodes removed by the GreedyWalk
algorithm for the SRMN problem, along with the age
group distribution of the entire population. Observe
that more people are selected in age-group numbers 7
to 11, which correspond to ages 25-49.

Main observations: 

1. GreedyWalk performs consistently better than
existing heuristics in removing nodes or edges in both
static and variable transmission rate settings.

2. Sparsification helps in improving the speed of
GreedyWalk without affecting the solution quality.










































































[Figure 5 panels: (a) AS Oregon-2, (b) Collaboration GrQc]

Figure 5: Impact of walk length on GreedyWalk
performance. Each plot shows the drop in spectral
radius (y-axis) with number of edges removed (x-axis),
for different values of k, ranging from 2 to 2 log n, for
the corresponding networks.


Table 3: Age-groups [26]

Age-group | Age   || Age-group | Age
    1     | 0     ||    10     | 40-44
    2     | 1-4   ||    11     | 45-49
    3     | 5-9   ||    12     | 50-54
    4     | 10-14 ||    13     | 55-59
    5     | 15-19 ||    14     | 60-64
    6     | 20-24 ||    15     | 65-69
    7     | 25-29 ||    16     | 70-74
    8     | 30-34 ||    17     | 75+
    9     | 35-39 ||           |



3. GreedyWalk performs best for walk-lengths of k =
2 log n.

4. GreedyWalk can potentially help in picking more
accurate non-network surrogates.

9 Related Work 

Related work comes from multiple areas: epidemiology,
immunization algorithms and other optimization
algorithms. There is general research interest in studying
dynamic processes on large graphs, such as (a) blogs and
propagations [17] [2^, (b) information cascades [15, 16] and
(c) marketing and product penetration [34]. These
dynamic processes are all closely related to virus
propagation.

Epidemiology: A classical text on epidemic models
and analysis is by May and Anderson [5]. Most work in
epidemiology is focused on homogeneous models.
Here we study network based models. Much work
has gone into finding epidemic thresholds (the minimum
virulence of a virus which results in an epidemic) for a
variety of networks [Ml [Ml (H1311.

[Figure 6 panels: (a) SRME-nonuniform: Barabasi-Albert, (b) SRME-nonuniform: Collaboration GrQc; legend: Degree, EigenScore, GreedyWalk]

Figure 6: Computing solutions for the SRME-nonuniform
(6a, 6b) and SRMN (6c, 6d) problems on
different networks with the GreedyWalk algorithm and the
Degree and EigenScore heuristics as adapted in Section 5.
The plots show the resultant spectral radius (y-axis)
as fractions of edges/nodes are removed (x-axis)
with different methods.


Immunization: There has been much work on finding
optimal strategies for vaccine allocation [HESKn].
Cohen et al. studied the popular acquaintance
immunization policy (pick a random person, and immunize
one of its neighbors at random). Using game theory,
Aspnes et al. [5] developed inoculation strategies for
victims of viruses under random starting points. Kuhlman
et al. studied two formulations of the problem of
blocking a contagion through edge removals under the
model of discrete dynamical systems. As already
mentioned, Tong et al. [38, 37], Van Mieghem et al. [28],
Prakash et al. [30] and Chakrabarti et al. [9] proposed
various node-based and edge-based immunization
algorithms based on minimizing the largest eigenvalue of the
graph. Other non-spectral approaches for immunization
have been studied by Budak et al. [8], He et al. [18] and
Khalil et al. [20].

Other Optimization Problems: Other diffusion
based optimization problems include the influence
maximization problem, which was introduced by Domingos
and Richardson [33], and formulated by Kempe et
al. [19] as a combinatorial optimization problem. They
proved it is NP-hard and also gave a simple (1 - 1/e)
approximation based on the submodularity of the expected
spread of a set of starting seeds. Other such problems,
where we wish to select a subset of ‘important’ vertices
on graphs, include ‘outbreak detection’ [23] and ‘finding
most-likely culprits of epidemics’ [25, 32].

[Figure 7 panels: (c) ProductEigenscore, (d) Hybrid; axis: Age Group]

Figure 7: Age-Group of the top 1500 removed edges

[Figure 8 panel: (a) Removed Nodes]

Figure 8: Age-group of 1500 removed nodes with
GreedyWalk from Portland contact graph.

10 Conclusions 

We study the problem of reducing the spectral radius of
a graph to control the spread of epidemics by removing
edges (the SRME problem) or nodes (the SRMN problem).
We have developed a suite of algorithms for these
problems, which give the first rigorous bounds for these
problems. Our main algorithm GreedyWalk performs
consistently better than all other heuristics for these
problems, in all networks we studied. We also develop
variants that improve the running time by sparsification,
and improve the approximation guarantee using a
primal dual approach. These algorithms exploit the
connection between the graph spectrum and closed walks in
the graph, and perform better than all other heuristics.
Improving the running time of these algorithms is a
direction for further research. We expect these techniques
could potentially help in optimizing other objectives
related to spectral properties, e.g., robustness, and in
other problems related to the design of interventions to
control the spread of epidemics.

Figure 9: (9a) Age-Group matrix of the top 1500 removed
edges and (9b) Age-group of 1500 removed nodes
with GreedyWalk from Portland contact graph.

A Appendix

A.1 GreedyWalk with Dynamic Programming Approach

Main idea: we adapt a dynamic programming approach
in sparse graphs to avoid matrix multiplication,
which leads to lower space complexity, thereby allowing
us to scale to larger graphs. We then observe that the
number of walks does not need to be recomputed each
time an edge is deleted.

Let $H_{uv}(G, x, l)$ denote the number of walks of
length $l$ from node $u$ through edge $(u, v)$ as the first edge
to node $x$ in $G$. It is easy to see that $H_{uv}(G, u, k) =
\mathrm{walks}(e, G, k)$, where $e = (u, v)$. Algorithm ClosedWalkDP describes
how to compute $H_{uv}(G, u, k) = \mathrm{walks}(e, G, k)$. In the
algorithm, $N(x)$ denotes the neighbors of node $x$ in $G$.


Algorithm 5 ClosedWalkDP($G$, $(u, v)$, $k$)

Input: $G$, $(u, v)$, $k \ge 2$

Output: Number of closed walks of length $k$ in $G$
containing $(u, v)$

1: Let $H_{uv}(G, v, 1) = 1$, $H_{uv}(G, x, 1) = 0$, $\forall x \in V \setminus \{v\}$
2: for $l = 2$ to $k$ do
3:   $H_{uv}(G, x, l) = \sum_{y \in N(x)} H_{uv}(G, y, l - 1)$, $\forall x \in V$
4: end for
5: return $H_{uv}(G, u, k)$
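The recurrence in Algorithm ClosedWalkDP can be sketched in a few lines of Python. This is a minimal illustration, assuming the graph is given as a dictionary mapping each node to its neighbors (a representation chosen here for convenience, not one prescribed by the paper); the function name is likewise illustrative.

```python
def closed_walk_dp(neighbors, u, v, k):
    """Count closed walks of length k that start at u, use (u, v) as the
    first edge, and end at u.

    neighbors: dict mapping each node to an iterable of its neighbors.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    # Layer l = 1: after the first step the walk must sit at v.
    prev = {x: 0 for x in neighbors}
    prev[v] = 1
    for _ in range(2, k + 1):
        # Layer l depends only on layer l - 1, so only one previous
        # layer is kept and the space used stays O(n).
        prev = {x: sum(prev[y] for y in neighbors[x]) for x in neighbors}
    return prev[u]


triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
print(closed_walk_dp(triangle, 0, 1, 4))
```

Each of the k - 1 layers touches every adjacency entry once, which is where the 2mk running time quoted for the algorithm comes from.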


Next, we describe in Algorithm GreedyEdgeChoice
how the greedy edge choice in line 4 of
Algorithm GreedyWalk is implemented efficiently. We
make use of the fact that $\mathrm{walks}(e, G', k) \le \mathrm{walks}(e, G, k)$
for any $G' \subseteq G$. In every iteration of Algorithm
GreedyEdgeChoice, potentially, we need to update
$f(\cdot)$ for all edges in $E \setminus E'$. However, in practice, we
observe that the number of such updates is very small
compared to $|E \setminus E'|$.


Algorithm 6 GreedyEdgeChoice

Input: $G$, $T$, $c(\cdot)$

Output: Edge set $E'$

1: Initialize $E' \leftarrow \emptyset$ and $\forall e \in E$, let $f(e) = \mathrm{walks}(e, G, k)$ // computed by ClosedWalkDP
2: while $W_k(G[E \setminus E']) > nT^k$ do
3:   Order edges of $E \setminus E'$ in the decreasing order of $f(\cdot)$ values. Let $e_1$ be the first edge.
4:   $E' \leftarrow E' \cup \{e_1\}$
5:   for $j = 2, \ldots, |E \setminus E'|$ do
6:     Update $f(e_j) = \mathrm{walks}(e_j, G[E \setminus E'], k)$.
7:     if $f(e_j) > f(e_{j+1})$ then
8:       Exit from the for loop
9:     end if
10:  end for
11: end while
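The lazy update in Algorithm GreedyEdgeChoice can be illustrated as follows. This is a sketch under stated assumptions rather than the authors' implementation: the graph is an undirected edge list, the stopping value is passed in directly as `threshold`, the total number of closed walks is obtained by summing the per-edge counts over both orientations of every edge, and all function names are hypothetical.

```python
def closed_walk_dp(neighbors, u, v, k):
    """Closed walks of length k from u whose first edge is (u, v)."""
    prev = {x: 0 for x in neighbors}
    prev[v] = 1
    for _ in range(2, k + 1):
        prev = {x: sum(prev[y] for y in neighbors[x]) for x in neighbors}
    return prev[u]


def total_closed_walks(neighbors, k):
    """W_k: every closed walk is counted once, by its start node and first edge."""
    return sum(closed_walk_dp(neighbors, u, v, k)
               for u in neighbors for v in neighbors[u])


def greedy_edge_choice(nodes, edges, threshold, k):
    neighbors = {x: set() for x in nodes}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    # Initial scores; later these may be stale upper bounds.
    f = {e: closed_walk_dp(neighbors, e[0], e[1], k) for e in edges}
    removed = []
    while f and total_closed_walks(neighbors, k) > threshold:
        order = sorted(f, key=f.get, reverse=True)
        best, rest = order[0], order[1:]
        removed.append(best)
        del f[best]
        neighbors[best[0]].discard(best[1])
        neighbors[best[1]].discard(best[0])
        # Refresh scores in order; stop once a fresh score beats the next
        # stale one, since stale scores can only overestimate.
        for j, e in enumerate(rest):
            f[e] = closed_walk_dp(neighbors, e[0], e[1], k)
            if j + 1 < len(rest) and f[e] > f[rest[j + 1]]:
                break
    return removed


print(greedy_edge_choice([0, 1, 2], [(0, 1), (1, 2), (0, 2)], 3 * 1.5 ** 4, 4))
```

The early exit is safe because walk counts never increase when edges are removed, so a stale value is always an upper bound on the fresh one.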


Running time and space complexity: Let $n = |V|$, $m =
|E|$. Note that ClosedWalkDP($G, e, k$) takes $2mk$
time to compute $\mathrm{walks}(e, G, k)$. Therefore, computing
$\mathrm{walks}(e, G, k)$ for all the edges takes $2m^2k = O(n^2k)$,
assuming $m = O(n)$ in real world networks. Since,
for computing $H_{uv}(G, x, l)$, $\forall x \in V$, ClosedWalkDP
needs to look only at $H_{uv}(G, y, l - 1)$, $\forall y \in V$,
the space complexity is $O(n)$.


A.2 Non-uniform transmission rates

Let $B = (\beta_{ij})$ denote the matrix of the transmission
rates. We assume the rates are symmetric, i.e., $\beta_{ij} =
\beta_{ji}$. In this case, the sufficient condition for the epidemic
to die out is slightly different, and is stated below.

Lemma A.1. Let $B$ be the matrix of transmission rates,
and let $\delta$ be the recovery rate in the SIS model. If
$\rho(B) < \delta$, the time to extinction $\tau$ satisfies

$$\mathrm{E}[\tau] \le \frac{\log n + 1}{\delta - \rho(B)}.$$
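As an illustrative calculation (the numbers are assumed for the example and do not come from the experiments, and the logarithm is taken to be natural), suppose $n = 1000$, $\delta = 0.5$ and $\rho(B) = 0.4$. The bound of Lemma A.1 then evaluates to:

```latex
\mathrm{E}[\tau] \;\le\; \frac{\log n + 1}{\delta - \rho(B)}
  \;=\; \frac{\ln 1000 + 1}{0.5 - 0.4}
  \;\approx\; \frac{7.908}{0.1}
  \;\approx\; 79.1
```

Halving the gap $\delta - \rho(B)$ to 0.05 doubles the bound, which is why pushing $\rho(B)$ further below $\delta$ shortens the guaranteed extinction time.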


For the case of uniform costs, i.e., $c(e) = 1$ for all
edges $e$, this motivates the following problem:

Definition A.1. SRME-nonuniform problem: Given
an undirected graph $G = (V, E)$, with transmission rate
$\beta_{ij}$ for each $(i, j) \in E$ and recovery rate $\delta$, find the
smallest set $E' \subseteq E$ such that $\rho(B(G[E - E'])) < \delta$.

In this section, we use $E_{OPT}$ to denote the optimum
solution to SRME-nonuniform($G, B, \delta$). Our algorithm
GreedyWalk-nonuniform adapts GreedyWalk
to a weighted covering problem. We need to refine
the definitions used earlier. For walk $w \in \mathcal{W}_k(G)$,
let $f(w) = \prod_{e=(i,j) \in E(w)} \beta_{ij}^{\mathrm{count}(e,w)}$ denote its weight,
where $\mathrm{count}(e, w)$ is the number of occurrences of edge
$e$ in walk $w$; for a set $W'$ of walks, let $f(W') =
\sum_{w \in W'} f(w)$ denote the total weight of $W'$. In the
algorithm, we will need to compute $f(\mathcal{W}_k(G))$, which
is done by modifying the recurrence used in Algorithm
CountWalks($G$) to compute $W_k(G)$:

/(Wfe(G)) = El + /(Wfc(G[P - {n}]). 


Let $f(e, G) = \sum_{w : e \in w} f(w)$ denote the total weight
of walks containing edge $e$; f(e, G) = B^. Algorithm
GreedyWalk-nonuniform involves the following
steps:

• $E' = \emptyset$

• while $f(\mathcal{W}_k(G[E - E'])) > n\delta$:

  - Pick the $e \in E \setminus E'$ that maximizes $\min\{f(\mathcal{W}_k(G[E - E'])) - n\delta,\ f(e, G[E \setminus E'])\} / c(e)$.

  - $E' \leftarrow E' \cup \{e\}$
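The steps above can be sketched in Python for small instances. This is a minimal sketch under several assumptions that are not taken from the paper: the total walk weight is computed as $\mathrm{trace}(B^k)$ with dense pure-Python matrix products, the weight of walks containing an edge $e$ is obtained as the total weight minus the weight of walks that avoid $e$, the stopping value is passed in as a `threshold` parameter, and every name is illustrative.

```python
def weighted_closed_walks(n, weights, k):
    """Total weight of closed walks of length k, i.e. trace(B^k).

    weights maps an undirected edge (i, j) to its symmetric rate beta_ij.
    """
    b = [[0.0] * n for _ in range(n)]
    for (i, j), beta in weights.items():
        b[i][j] = b[j][i] = beta
    power = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        power = [[sum(power[i][t] * b[t][j] for t in range(n))
                  for j in range(n)] for i in range(n)]
    return sum(power[i][i] for i in range(n))


def greedy_walk_nonuniform(n, weights, cost, threshold, k):
    remaining = dict(weights)
    removed = []
    total = weighted_closed_walks(n, remaining, k)
    while remaining and total > threshold:
        residual = total - threshold  # weight that still has to be covered

        def gain(e):
            without = {x: w for x, w in remaining.items() if x != e}
            # Weight of walks through e = all walks minus walks avoiding e.
            f_e = total - weighted_closed_walks(n, without, k)
            return min(residual, f_e) / cost[e]

        best = max(remaining, key=gain)
        removed.append(best)
        del remaining[best]
        total = weighted_closed_walks(n, remaining, k)
    return removed


rates = {(0, 1): 0.9, (1, 2): 0.1, (0, 2): 0.1}
unit_cost = {e: 1.0 for e in rates}
print(greedy_walk_nonuniform(3, rates, unit_cost, 0.5, 2))
```

Recomputing the trace for every candidate edge is far too slow for real networks; it is used here only because it makes the covering objective explicit.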

Lemma A.2. Let $E'$ denote the set of edges found by
Algorithm GreedyWalk-nonuniform. Given any
constant $\epsilon > 0$, let $k$ be an even integer greater than
$\log n / \log(1 + \epsilon/3)$. Then we have $\rho(B(G[E \setminus E'])) < (1 + \epsilon)\delta$
and $c(E') = O(c(E_{OPT}) \log n \log \Delta)$.

Proof. The bound on $\rho(B(G[E \setminus E']))$ follows on the
same lines as the proof of Lemma 3.1. The main
difference is that the proof of [3^ does not consider
the case of weights associated with elements. But,
as we argue now, the same approach for analyzing
greedy algorithms extends to our case, and we show
$c(E') = O(c(E_{HITOPT}) \log n)$.

We partition the iterations of Algorithm
GreedyWalk-nonuniform into $O(\log n)$ phases.
Each phase ends at the first iteration when the
total weight that needs to be further covered goes
down by a factor of at least 2. So if $F$ is the
weight that needs to be covered at the start of the
phase, in every iteration of the phase, there exists an
edge $e$ (which is in an optimum solution) such that
$f(e, G[E \setminus E'])/c(e) > F/(2c(E_{HITOPT}))$. Thus, the
total cost of the edges selected in the phase is at most
$2c(E_{HITOPT})$. Since the ratio of $n\delta$ over the minimum
weight of a walk is polynomial in $n$, the total number of
phases is $O(\log n)$. Adding over all phases then yields
the desired bound on $c(E')$. Putting this together with
the rest of the proof of Lemma 3.1 yields the desired
bound.

A.3 Node version: SRMN problem

Recall the definition of $\mathrm{walks}(v, G, k)$ from Section 2.
Let $G[V'']$ denote the subgraph of $G = (V, E)$
induced by subset $V'' \subseteq V$. We modify Algorithm
GreedyWalk to work for the SRMN problem in the
following manner:


Algorithm 7 Algorithm GreedyWalkSRMN
1: Initialize $V' \leftarrow \emptyset$
2: while $W_k(G[V \setminus V']) > nT^k$ do
3:   $r \leftarrow W_k(G[V \setminus V']) - nT^k$
4:   Pick $v \in V \setminus V'$ that maximizes $\min\{r, \mathrm{walks}(v, G[V \setminus V'], k)\} / c(v)$
5:   $V' \leftarrow V' \cup \{v\}$
6: end while

It can be shown on the same lines as Lemma 3.1 that
this gives a solution of cost $O(c(E_{OPT}(T)) \log n \log \Delta)$,
where $c(E_{OPT}(T))$ denotes the cost of the optimal
solution to the SRMN problem. Further, the same running
time bounds as in Sections 3.1 and 3.2 hold.

A.4 Proof of Theorem 7.1

Construction: We construct a graph $G$ for which the
statement holds. For convenience let us assume that
$T'$ is a positive integer. $G$ contains (1) a clique $G_1$ on
$T' + 1$ nodes; (2) a caterpillar tree $G_2$, which comprises
a path $v_1 v_2 \cdots v_{q-1}$ with each $v_i$ adjacent to $T'$ leaves;
and (3) $G_3$, a star graph with $(T'+1)^2$ leaves and central
vertex denoted by $v_q$. We connect $G_1$ to $G_2$ by $(v_0, v_1)$,
where $v_0$ is some node in $G_1$, and $G_2$ is connected to $G_3$
by the edge $(v_q, v_{q-1})$. Note that q = and
$\lambda_1(G) > \lambda_1(G_3) = T' + 1$. Again, here we assume that
$q$ is an integer.

Bound on $c(E_{OPT})$: We will show that $c(E_{OPT}) \le
2T' + 3$. Removing the edges $(v_0, v_1)$ and $(v_{q-1}, v_q)$
isolates the components $G_1$, $G_2$ and $G_3$. $G_1$ is a
clique on $T' + 1$ nodes and on removing one edge, its
spectral radius decreases below $T'$. $G_3$ is a star with
$(T' + 1)^2$ leaves and therefore, on removing at most
$(T' + 1)^2 - (T'^2 + 1)$ edges, its spectral radius decreases
below $T'$. It can be shown that $\lambda_1(G_2)$ < + 2.

Now we will demonstrate that all the four algorithms
score the edges $(v_i, v_{i+1})$, $i = 0, \ldots, q - 2$ above
any edge belonging to the clique $G_1$. However, the
spectral radius cannot be brought down below $T'$ until
at least one edge in $G_1$ is removed. Therefore, at
least $q$ edges will be removed by all the algorithms. By
the initial assumption that T' < it follows that
q = H(^), while $c(E_{OPT}) = O(T')$, hence completing
the proof. Now we analyze each algorithm separately.

ProductDegree: For all $u \in V(G_1)$, $d(u) \le T' + 1$
while, for each $i = 1, \ldots, q$, $d(v_i) \ge T' + 2$. Therefore,
$(v_i, v_{i+1})$, $i = 0, \ldots, q - 2$ has higher score than any edge
in $G_1$.

EigenScore: Let $x$ denote the unit eigenvector
corresponding to $\lambda_1(G)$ and for any $v \in V(G)$, let $x(v)$
denote the $v$th component of $x$. We will show that
$x(v_{q-1}) > x(v_{q-2}) > \cdots > x(v_0) > x(v')$, where $v'$ is
any vertex in $G_1$ other than $v_0$. This implies that all
the edges $(v_i, v_{i+1})$, $i = 0, \ldots, q - 2$ have eigenscore
greater than the edges in $G_1$.

Let $\lambda := \lambda_1(G)$. By symmetry, all $v' \in V(G_1) \setminus \{v_0\}$
have the same eigenvector component $x(v')$ and all
leaves of $v_i$ have the same component $x(l_i)$. Let $A$ be
the adjacency matrix of $G$. Since $Ax = \lambda x$, we have








(A.1a)  $\lambda x(v') = (T' - 1)\,x(v') + x(v_0)$

(A.1b)  $\lambda x(v_0) = T' x(v') + x(v_1)$

(A.1c)  $\lambda x(v_i) = x(v_{i-1}) + x(v_{i+1}) + T' x(l_i), \quad 1 \le i \le q - 1$

(A.1d)  $\lambda x(v_q) = x(v_{q-1}) + (T' + 1)^2 x(l_q)$

(A.1e)  $\lambda x(l_i) = x(v_i), \quad 1 \le i \le q$.

From (A.1a) and the fact that $\lambda > T' + 1$,

(A.2)  $x(v_0) = (\lambda - T' + 1)\,x(v') > 2x(v')$.

By induction on i, we will show that x{vi) > '^x{vi-i) 
for z = — 1. The base case is z = 1. Us¬ 

ing (lA.lbl) . (Ia' 2I) and the bound A > T' -|- 1, 

(A.3) x{yi) = Xx{vo) - T'x{v') >> ^x{vo) ■ 

Assuming x{vi) > '^x{vi-i) and apply¬ 

ing (lA.lcI) , (|A.leP and again A > T' -|- 1, 

(A.4) 

x{vi+i) = Xx{vi) - x{vi-i) - T'x{li) 

(a. 5) > 1 ^ “ y) 

(A.6) >[T' + ^-^-—r^^x{vi)>'^x{vi). 

From (A.2) and (A.4), it follows that $x(v_{q-1}) >
x(v_{q-2}) > \cdots > x(v_0) > x(v')$.

Hybrid: Since both ProductDegree and EigenScore
rate edges $(v_i, v_{i+1})$, $i = 0, \ldots, q - 2$, higher
than any edge in $G_1$, it follows that the same holds for
Hybrid as well.

LinePagerank: Let $\pi(e)$ denote the pagerank of
edge $e$. We will show that $\pi(v_{q-1} v_q) = \pi(v_{q-2} v_{q-1}) =
\cdots = \pi(v_1 v_2) > \pi(v_0 v_1) > \pi(e^c_{v_0}) > \pi(e^c)$, where
$\pi(e^c_{v_0})$ (by symmetry) is the pagerank of every edge
in clique $G_1$ incident with $v_0$, while $\pi(e^c)$ (again by
symmetry) is the pagerank of every other edge in the
clique. Let $l_i$ denote the leaf edges incident with $v_i$
for $i = 1, \ldots, q$. Pagerank of each edge is computed
as follows: $\pi(e) = \sum_{e' \in N(e)} \pi(e')/d(e')$, where $N(e)$ and $d(e)$
denote the set of neighbors and degree respectively of $e$
in the line graph.


In the line graph, the degrees of each edge of $G$ are
as follows: $d(e^c) = 2(T' - 1)$; $d(e^c_{v_0}) = 2T' - 1$; $d(v_0 v_1) =
2T' + 1$; $d(v_{q-1} v_q) = (T' + 1)^2 + 1$; $d(v_i v_{i+1}) = 2(T' +
1)$, $i = 1, \ldots, q - 2$; $d(l_i) = T' + 1$, $i = 1, \ldots, q - 1$. The
pageranks of the relevant edges are as follows:


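(A.7a)  $\pi(e^c) = \frac{2(T' - 1) - 2}{2(T' - 1)}\,\pi(e^c) + \frac{2\pi(e^c_{v_0})}{2(T' - 1) + 1}$

(A.7b)  $\pi(e^c_{v_0}) = \frac{\pi(e^c)}{2} + \frac{T' - 1}{2T' - 1}\,\pi(e^c_{v_0}) + \frac{\pi(v_0 v_1)}{2T' + 1}$

(A.7c)  $\pi(v_0 v_1) = \frac{T'\pi(e^c_{v_0})}{2T' - 1} + \frac{\pi(v_1 v_2)}{2(T' + 1)} + \frac{T'\pi(l_1)}{T' + 1}$

(A.7d)  $\pi(v_1 v_2) = \frac{\pi(v_0 v_1)}{2T' + 1} + \frac{\pi(v_2 v_3)}{2(T' + 1)} + \frac{T'(\pi(l_1) + \pi(l_2))}{T' + 1}$

(A.7e)  $\pi(v_i v_{i+1}) = \frac{\pi(v_{i-1} v_i) + \pi(v_{i+1} v_{i+2})}{2(T' + 1)} + \frac{T'(\pi(l_i) + \pi(l_{i+1}))}{T' + 1}, \quad i = 2, \ldots, q - 2$

(A.7f)  $\pi(l_1) = \frac{T' - 1}{T' + 1}\,\pi(l_1) + \frac{\pi(v_0 v_1)}{2T' + 1} + \frac{\pi(v_1 v_2)}{2(T' + 1)}$

(A.7g)  $\pi(l_i) = \frac{T' - 1}{T' + 1}\,\pi(l_i) + \frac{\pi(v_{i-1} v_i) + \pi(v_i v_{i+1})}{2(T' + 1)}, \quad i = 2, \ldots, q - 2$.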


Using (A.7), we have the following:

(A.8a)  (A.7a) $\Rightarrow \pi(e^c_{v_0}) = \frac{2T' - 1}{2(T' - 1)}\,\pi(e^c)$

(A.8b)  (A.7b) and (A.8a) $\Rightarrow \pi(v_0 v_1) = \frac{2T' + 1}{2T' - 1}\,\pi(e^c_{v_0})$

(A.8c)  (A.7c), (A.7f) and (A.8b) $\Rightarrow \pi(v_1 v_2) = \frac{2(T' + 1)}{2T' + 1}\,\pi(v_0 v_1)$

(A.8d)  (A.7d), (A.7f), (A.7g) and (A.8c) $\Rightarrow \pi(v_2 v_3) = \pi(v_1 v_2)$.

Now, by induction on $i$ we can show that $\pi(v_i v_{i+1}) =
\pi(v_{i-1} v_i)$, for $i = 2, \ldots, q - 2$. The base case $i = 2$ is
covered in (A.8d). For any $k \ge 2$, applying $\pi(v_k v_{k+1}) =
\pi(v_{k-1} v_k)$ in (A.7e) (with $i = k$) and (A.7g), it follows
that $\pi(v_{k+1} v_{k+2}) = \pi(v_k v_{k+1})$.
Hence, proved.
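Parts of the argument above can be sanity-checked numerically on a small instance of the construction. The sketch below is illustrative only: the values `t = 3` (standing in for $T'$) and `q = 5` are arbitrary choices, the node labels and function names are hypothetical, and the check covers only the ProductDegree comparison and some of the line-graph degrees, not the eigenvector or pagerank claims.

```python
from collections import Counter
from itertools import combinations


def build_lower_bound_graph(t, q):
    """Clique on t + 1 nodes, caterpillar path v_1..v_{q-1}, star centered at v_q."""
    clique = [("c", i) for i in range(t + 1)]            # ("c", 0) plays v_0
    spine = [clique[0]] + [("v", i) for i in range(1, q + 1)]  # v_0, ..., v_q
    edges = list(combinations(clique, 2))                # clique G_1
    edges += list(zip(spine, spine[1:]))                 # (v_0,v_1), ..., (v_{q-1},v_q)
    for i in range(1, q):                                # t leaves on v_1..v_{q-1}
        edges += [(("v", i), ("leaf", i, j)) for j in range(t)]
    edges += [(("v", q), ("leaf", q, j)) for j in range((t + 1) ** 2)]  # star G_3
    return edges, clique, spine


t, q = 3, 5
edges, clique, spine = build_lower_bound_graph(t, q)
deg = Counter(x for e in edges for x in e)
clique_edges = list(combinations(clique, 2))
path_edges = list(zip(spine, spine[1:]))[:q - 1]         # (v_i, v_{i+1}), i = 0..q-2


def product_degree(e):
    return deg[e[0]] * deg[e[1]]


def line_degree(e):
    """Degree of edge e in the line graph: edges sharing an endpoint with e."""
    return deg[e[0]] + deg[e[1]] - 2


print(min(map(product_degree, path_edges)), max(map(product_degree, clique_edges)))
print(line_degree((spine[0], spine[1])), line_degree((spine[1], spine[2])))
```

For this instance every path edge has a strictly larger degree product than every clique edge, which is the ordering the ProductDegree step relies on.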


































































